Researchers continue to refine existing methods for data replication in
distributed database systems. The main goal is an efficient distributed
environment that can handle huge amounts of data and preserve data
availability. Occasional failures in distributed systems affect the end
results, causing data loss, income loss and so on. Thus, to prevent data loss
and guarantee business continuity, many organizations have applied disaster
recovery solutions in their systems. One of the most widely used is database
replication, because it supports data safety and availability. However,
failures can still occur in database replication. Hence, an automatic failure
recovery technique called distributed database replication with fault
tolerance (DDR-FT) is proposed in this research. DDR-FT uses heartbeat
messages for node monitoring and is built on the binary vote assignment for
fragmented database (BVAFD) replication technique. In DDR-FT, the data nodes
are continuously monitored and the system reconfigures itself for automatic
failure recovery. The conducted experiments show that DDR-FT can preserve
system availability, providing a convenient approach to system availability
for distributed database replication in a real-time environment.
1. INTRODUCTION

Nowadays, access to high-quality data can help to enhance the quality of life both for those
working in an organization and for the public. Hence, it is crucial to ensure that the database system is well
managed and secure. There are two types of database systems: centralized and distributed. A centralized
database system is located, stored and maintained in a single location. When the single database server
crashes or a mishap happens, everything is lost. Thus, to increase and preserve data availability and
reliability, it is better to store the data on multiple servers rather than on one server [1]-[3]. This approach
is called a distributed database system [1], [4], [5]. In the event of a disaster, data availability is preserved
because distributed systems offer higher reliability and incremental growth [6].

Distributed database systems (DDBS) are sets of logically networked computer databases,
managed at different sites and locations and accessible to the user locally or via the Internet for different
tasks as a single database, with transactions running in parallel [4], [7]. Designing a DDBS is a very
demanding task since an optimum level of performance must be continuously satisfied [8], [9]. Data
availability plays a major role in the success of information systems because data must always be available
in order to meet the users' requirements.

Data replication is one of the most widely used solutions for ensuring data safety and availability in
distributed database systems [10], [11]. It plays a critical role in promoting business for any organisation,
refining the quality of the data, and improving data sharing as well as the process of big data analysis [12].
Therefore, how the data replication method is implemented becomes vital when a failure interrupts it.
In particular, failures occur regularly on the Internet, in clouds and in scale-out data center
networks [13]-[16]. Hence, a fault tolerance method is needed in data replication transactions [17]-[19]. A
system using such services is required to be equipped with resources so that it can continue to
operate even in the presence of faults [17], [20]-[22]. A fault should be detected by a reliable
fault detector, followed by a recovery technique.

In this research, a data replication technique called binary vote assignment for fragmented database
(BVAFD) is combined with a proposed fault tolerance technique called distributed database replication with
fault tolerance (DDR-FT), with the aim of assessing the efficiency of combining a data replication technique
with a fault tolerance technique for better performance of a single database replication transaction in the
event of failure. The paper is arranged as follows. Section 2 is the literature review, which details the
BVAFD data replication technique. Section 3 describes the procedure of the DDR-FT algorithm, which is
employed within the BVAFD framework. Section 4 presents and discusses the outcomes obtained from a
series of experiments. Finally, section 5 concludes this research.


2. LITERATURE REVIEW 

Figure 1 shows the concept of binary vote assignment for fragmented database (BVAFD), which copies
some data from the primary node to some of the adjacent nodes [23]. A full replication mechanism may
waste a lot of storage space and consume a lot of bandwidth [24] because it copies all data to all
replication nodes. With BVAFD, the replication execution time is reduced since it copies only some data
to some nodes [24], [25]. BVAFD is paving a new path in distributed database replication as it helps to
maximize write availability with low communication cost [26].

In BVAFD, each node has a primary database table, PDT. The PDT is copied from the primary node, PS,
to the neighbour nodes, NS [23]. The PS of any PDT and its NS are assigned vote one (1) or vote
zero (0). This assignment is treated as an allocation of replicated copies, and a vote assigned to a site results
in a copy allocated at the neighbour [27]. The PS of any PDT and its NS are also assigned a status that
depends on their condition: status = 0 when the PS is accessible, and status = 1 when the PS is
inaccessible or busy.
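
The vote assignment can be illustrated with a minimal Python sketch. It assumes that a vote of 1 marks a node holding a copy of the PDT (the primary node and its direct neighbours) and that every other node receives vote 0. The function name `assign_votes` and the five-node topology are hypothetical and are not taken from the BVAFD literature or from Figure 1.

```python
from typing import Dict, List


def assign_votes(primary: int, neighbours: Dict[int, List[int]]) -> Dict[int, int]:
    """Return a vote (1 or 0) per node for the PDT owned by `primary`.

    Assumption: vote 1 means the node holds a copy of the PDT, which
    here is the primary node and its direct neighbours only.
    """
    holders = {primary, *neighbours.get(primary, [])}
    return {node: 1 if node in holders else 0 for node in neighbours}


# Hypothetical five-node topology (not the layout of Figure 1).
topology = {1: [2, 4], 2: [1, 3, 5], 3: [2], 4: [1, 5], 5: [2, 4]}
votes = assign_votes(1, topology)
print(votes)  # {1: 1, 2: 1, 3: 0, 4: 1, 5: 0}
```

Only three of the five nodes store the table in this example, which is the storage and bandwidth saving over full replication that the text describes.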


Figure 1. Examples of data replication in BVAFD
3. RESEARCH METHODS

In this section, distributed database replication with fault tolerance (DDR-FT) is proposed by
considering fault detection and fault recovery in BVAFD. The monitor node continuously monitors the
heartbeat messages from all the data nodes. The recovery process begins automatically once a fault has
been detected. The following notations are defined:


- N_d is a node with data
- N_m is a monitor node
- N'_d is a neighbour node, where d = 1, 2, 3, 4, or 5
- N_{d,f} is a failure node transformed from N_d
- F_0 is when no failure occurs
- F_1 is when failure occurs
- IP_p is a primary IP address
- IP_v is a virtual IP address

All N_d, where d = {1, 2, 3, 4, 5}, continuously send a heartbeat message, h, to N_m. Hence, N_m always
monitors all heartbeat messages received from N_d. If N_m does not receive any heartbeat message from N_d,
it assumes N_d = F_1 (failure) and transforms N_d to N_{d,f}. Next, N_m searches for an available N'_d of
N_{d,f}. Once N_m gets the N'_d, it assigns the N'_d as the backup node for N_{d,f}. Then, N'_d creates
IP_v, which contains the IP of N_{d,f}. Now N'_d has two IP addresses: its own IP_p and the IP_v of N_{d,f}.
Once N_m detects h = 1 from N_{d,f}, then N_{d,f} = F_0 → N_d (no failure); it deletes the IP_v from N'_d
and continues monitoring the nodes.
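
The detection and recovery steps above can be sketched as a small in-memory model. This is a simplified illustration under stated assumptions, not the authors' implementation: the `Monitor` class, its method names, the 3-second timeout and the IP address of node 4 are all hypothetical, and timestamps are passed in explicitly rather than read from a clock.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Monitor:
    """Toy model of the monitor node N_m (names and timeout are assumptions)."""
    neighbours: Dict[int, List[int]]      # candidate N'_d for each data node
    primary_ip: Dict[int, str]            # IP_p of every data node
    timeout: float = 3.0                  # silence (s) before failure is assumed
    last_seen: Dict[int, float] = field(default_factory=dict)
    failed: Dict[int, int] = field(default_factory=dict)        # failed -> backup
    virtual_ips: Dict[int, List[str]] = field(default_factory=dict)

    def heartbeat(self, node: int, now: float) -> None:
        """Record h = 1 from `node`; if it had failed, delete its IP_v."""
        self.last_seen[node] = now
        backup = self.failed.pop(node, None)
        if backup is not None:
            self.virtual_ips[backup].remove(self.primary_ip[node])

    def check(self, now: float) -> None:
        """Mark silent nodes as failed and give their IP to a live neighbour."""
        for node, seen in self.last_seen.items():
            if node in self.failed or now - seen <= self.timeout:
                continue
            backup = self._live_neighbour(node, now)
            if backup is not None:
                self.failed[node] = backup
                self.virtual_ips.setdefault(backup, []).append(self.primary_ip[node])

    def _live_neighbour(self, node: int, now: float) -> Optional[int]:
        for cand in self.neighbours.get(node, []):
            seen = self.last_seen.get(cand)
            if cand not in self.failed and seen is not None and now - seen <= self.timeout:
                return cand
        return None


# Node 1 and node 2 addresses follow Figure 8; node 4's address is made up.
m = Monitor(neighbours={1: [2, 4], 2: [1], 4: [1]},
            primary_ip={1: "192.168.0.102", 2: "192.168.0.101", 4: "192.168.0.104"})
for n in (1, 2, 4):
    m.heartbeat(n, now=0.0)
m.heartbeat(2, now=4.0)
m.heartbeat(4, now=4.0)            # node 1 stays silent
m.check(now=5.0)
after_failure = (dict(m.failed), {k: list(v) for k, v in m.virtual_ips.items()})
m.heartbeat(1, now=6.0)            # node 1 is back
after_recovery = (dict(m.failed), {k: list(v) for k, v in m.virtual_ips.items()})
print(after_failure)
print(after_recovery)
```

In this run node 2 takes over the address of node 1 while node 1 is silent and releases it once a heartbeat from node 1 arrives again.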


4. RESULT AND DISCUSSIONS 

In this section, experiments on cases that occur during real-time transactions are presented. They
involve three replication nodes, called node 1, node 2 and node 4, in one distributed database system. All
data on these three servers are supposed to be the same replicated data. Therefore, all the data are updated
synchronously.


4.1. Experiment 1: no failure occurs in any nodes 

Figure 2 shows N_d = F_0 (no failure). N_m receives all the heartbeat messages sent from N_d, where d = 1, 2 and 4.
Thus, it assumes that no failure has occurred. Figure 3 shows the monitor node, N_m, receiving heartbeats
from nodes 1, 2 and 4. During this time, all N_d send their heartbeat messages to N_m without any failure.


ping.log on the monitor node (excerpt; the node number at the end of each entry is cut off in the capture)

[2020-12-14 14:57:00,802] [ping] [INFO] - Received alive ping from node
[2020-12-14 14:57:02,071] [ping] [INFO] - Received alive ping from node
[2020-12-14 14:57:02,821] [ping] [INFO] - Received alive ping from node
[2020-12-14 14:57:03,044] [ping] [INFO] - Received alive ping from node


Figure 2. Data nodes without any failure occurrence

Figure 3. Monitor node receiving heartbeats


4.2. Experiment 2: failure occurs in data node 1 without DDR-FT
Figure 4 shows N_1 = F_1 → N_{1,f}, which means that a failure occurs during the transaction. Node 1 is
not connected to the network. When this happens, N_{1,f} tries to reconnect to N_m every 5 seconds.
However, since there is no fault tolerance in this system, the problem remains until it is fixed
manually. No transaction can proceed until the problem is solved.
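
The 5-second reconnection behaviour of the failed node can be sketched as a simple retry loop. This is only an illustration: the function `reconnect`, the `max_attempts` limit and the injectable `sleep` are assumptions introduced so that the sketch terminates and can be run without waiting. The source describes a client that keeps retrying until the fault is fixed manually.

```python
import time
from typing import Callable


def reconnect(connect: Callable[[], bool],
              retry_interval: float = 5.0,
              max_attempts: int = 3,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Try to reach the monitor node; return the successful attempt number, or 0."""
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            sleep(retry_interval)   # wait before the next attempt
    return 0


# Simulated link: the first two attempts fail, the third succeeds.
waits = []
outcomes = iter([False, False, True])
result = reconnect(lambda: next(outcomes), sleep=waits.append)
print(result, waits)  # 3 [5.0, 5.0]

# Without any recovery the link never comes back and the client gives up.
gave_up = reconnect(lambda: False, max_attempts=2, sleep=lambda s: None)
print(gave_up)  # 0
```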


An automate failure recovery for synchronous distributed database system (Ahmad Shukri Mohd Noor) 




[Screenshot: Administrator Command Prompt window running py client.py]

Figure 4. Node 1 attempting to send its heartbeat


4.3. Experiment 3: failure occurs in data node 1 with DDR-FT
Figure 5 shows N_1 = F_1 → N_{1,f}, which means that a failure occurs during the transaction. In this
experiment, DDR-FT has been applied in BVAFD for failure detection and recovery. Node 1 is
not connected to the network, which is detected by the monitor node. The monitor node then
begins the recovery process. In this experiment, N_1 does not send any heartbeat messages to N_m. As shown
in Figure 6, N_m does not receive any heartbeat messages from N_1. Hence, it assumes N_1 = F_1,
transforms N_1 to N_{1,f} and begins the recovery process. N_m searches for any N'_d, where d = 2 or 4, to
back up N_{1,f}.




Figure 5. Failure nodes without fault tolerance 


operations.log

[2020-12-14 14:43:32,412] [operations] [INFO] - Node 1 has joined with session ID: bd21af13363d4bacb4f33cfcb480732d
[2020-12-14 14:46:38,238] [operations] [INFO] - Node 1 has joined with session ID: 8f383be386f641fdbaeb023fd67efac4
[2020-12-14 14:49:03,375] [operations] [INFO] - Node 2 has joined with session ID: adcb2705c9c34a6587253342d604fbid
[2020-12-14 14:54:36,834] [operations] [INFO] - Node 4 has joined with session ID: f31757c991ae45618201d4401d54ea89
[2020-12-14 15:00:02,757] [operations] [CRITICAL] - Node 1 disconnected. Looking up neighbors for recovery
[2020-12-14 15:00:02,759] [operations] [INFO] - Found neighbors : ['2']. Assigning 2 for recovery
[2020-12-14 15:00:04,353] [operations] [INFO] - Recovery Success by node 2 with new Virtual IP as: 192.168.0.102. Updating records...


Figure 6. Failure recovery process from monitor node operations log

After N_m finds an available neighbour node, in this experiment N'_2, N'_2 creates an IP_v for N_{1,f},
which is the IP address of N_1, as shown in Figure 7. Figure 8 shows that N'_2 now has two IP addresses:
its own address, IP_p = 192.168.0.101, and the virtual address taken over from the failed node N_{1,f},
IP_v = 192.168.0.102.


Node 2 operations log (the start of each timestamp is cut off in the capture)

06:17,456] [operations] [INFO] - Received recovery request. Recovering node 1
06:17,456] [operations] [INFO] - creating virtual IP address: 192.168.0.102
06:17,456] [operations] [CRITICAL] - netsh interface ipv4 add address "Ethernet0" 192.168.0.102 255.255.255.0
06:17,782] [operations] [CRITICAL] - Success
06:17,797] [operations] [INFO] - Successfully created secondary IP as 192.168.0.102 Notifying Server...


Figure 7. Node 2 receives recovery request
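
The netsh call recorded in the node 2 log can be assembled as in the following sketch, which only builds the argument lists and does not execute them. The helper names `add_virtual_ip_cmd` and `delete_virtual_ip_cmd` are hypothetical, and the interface name "Ethernet0" is an assumption read from the partly illegible log. On Windows such a list could be passed to `subprocess.run` from an elevated prompt.

```python
import ipaddress
from typing import List


def add_virtual_ip_cmd(interface: str, address: str, netmask: str) -> List[str]:
    """Build the netsh arguments that add a secondary (virtual) IPv4 address."""
    ipaddress.IPv4Address(address)   # raises ValueError on malformed input
    ipaddress.IPv4Address(netmask)
    return ["netsh", "interface", "ipv4", "add", "address",
            interface, address, netmask]


def delete_virtual_ip_cmd(interface: str, address: str) -> List[str]:
    """Build the netsh arguments that remove the virtual address again."""
    ipaddress.IPv4Address(address)
    return ["netsh", "interface", "ipv4", "delete", "address",
            interface, address]


# Values from Figures 7 and 8; the interface name is an assumption.
add_cmd = add_virtual_ip_cmd("Ethernet0", "192.168.0.102", "255.255.255.0")
del_cmd = delete_virtual_ip_cmd("Ethernet0", "192.168.0.102")
print(" ".join(add_cmd))
print(" ".join(del_cmd))
```

Validating the address before building the command keeps a malformed value from reaching the shell. The delete command corresponds to the step in which the IP_v is removed once the failed node sends heartbeats again.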


[Screenshot: Advanced TCP/IP Settings dialog for node 2, listing IP addresses 192.168.0.101 and 192.168.0.102, both with subnet mask 255.255.255.0, and default gateway 192.168.0.1]


Figure 8. IP address for node 2


5. CONCLUSION 

This study proposed an automatic failure recovery technique called DDR-FT. DDR-FT focuses on
fault detection and failure recovery, and runs within a replication technique called BVAFD. The series
of experiments conducted shows that DDR-FT can preserve system availability in BVAFD in the event of
a failure. It also shows that DDR-FT provides a convenient approach to data availability for distributed
database replication in a real-time environment. However, DDR-FT can be improved in several ways.
Data consistency is very important in a distributed database, and DDR-FT currently does not support a
post-recovery process. In future, DDR-FT will handle data consistency after the failed node is operating
again. In addition, the method is intended to be fully implemented in various systems and its performance
compared against other existing fault tolerance methods. The framework will be expanded to cover not
only the ability to detect and recover from failure, but also to allow the application to automatically
access various recovery and masking mechanisms.